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Abstract 

We present a numerical method to solve the optimal transport problem with 
a quadratic cost when the source and target measures are periodic probability 
densities. This method is based on a numerical resolution of the correspond- 
ing Monge-Ampere equation. We extend the damped Newton algorithm of 
Loeper and Rapetti [16] to the more general case of a non uniform density 
which is relevant to the optimal transport problem, and we show that our al- 
gorithm converges for sufficiently large damping coefficients. The main idea 
consists of designing an iterative scheme where the fully nonlinear equation is 
approximated by a non-constant coefficient linear elliptic PDE that we solve 
numerically. We introduce several improvements and some new techniques 
for the numerical resolution of the corresponding linear system. Namely, we 



use a Fast Fourier Transform (FFT) method by Strain [24(, which allows 
to increase the efficiency of our algorithm against the standard finite differ- 
ence method. Moreover, we use a fourth order finite difference scheme to 
approximate the partial derivatives involved in the nonlinear terms of the 
Newton algorithm, which are evaluated once at each iteration; this leads to 
a significant improvement of the accuracy of the method, but does not sac- 
rifice its efficiency. Finally, we present some numerical experiments which 
demonstrate the robustness and efficiency of our method on several exam- 
ples of image processing, including an application to multiple sclerosis disease 
detection. 
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1. Introduction 

The optimal transport problem, also known as the Monge-Kantorovich 



problem, originated from a famous engineering problem by Monge 18] for 
which Kantorovich produced a more tractable relaxed formulation jl4]. This 
problem deals with the optimal way of allocating resources from one site to 
another while keeping the cost of transportation minimal. Formally, if /z is a 
probability measure modelling the distribution of material in the source do- 
main X C R d , and v is another probability measure modelling the structure 
of the target domain Y C M. d , the Monge-Kantorovich problem consists of 
finding the optimal transport plan T in 

where c(x — y) denotes the cost of transporting a unit mass of material 
from a position x G X to a location y £ Y, and T#/x = v means that 
p{B) = /i (T _1 (.B)) for all Borel sets Y C M n , that is, the quantity of mate- 
rial supplied in a region B of Y coincides with the total amount of material 
transported from the region T~ 1 {B) of X via the transport plan T. When 
the cost function is quadratic, i.e. c(x — y) = \x — y\ 2 /2, the corresponding 
optimal transport problem in known as the L 2 optimal transport problem. 
This particular case has attracted many researchers in the past few decades, 
and a lot of interesting theoretical results have been obtained along with 
several applications in science and engineering, such as meteorology, fluid 
dynamics and mathematical economics. We refer to the recent monographs 



of Villani [27|, |28| for an account on these developments. One of the most 
important results concerns the form of the solution to the L 2 optimal trans- 
port problem. Indeed if both the source and target measures are absolutely 
continuous with respect to the Lebesgue measure on M. d , dfi/dx = f(x), 
du/dx = g(x), Brenier [lj showed that the L 2 optimal transport problem has 
a unique invertible solution T (// a.e.) that is characterized by the gradient 
of a convex function, T = V^. Moreover, if / and g are sufficiently regular 
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(in a sense to be specified later) , it is proved that \I/ is of class C 2 and satisfies 
the Monge-Ampere equation 



g(V^(x))det(D 2 ^(x)) = f(x), (2) 

sec Delanoe Q, Caffarelli (JQ, Urbas j26|). Therefore, for smooth source 
and target probability densities / and g, a convex solution \l/ to the Monge- 
Ampere equation (j2]) provides the optimal solution T = V\I/ to the L 2 optimal 
transport problem. 

In this paper, we are interested in the numerical resolution of the L 2 op- 
timal transport problem. Concerning this issue, only few numerical methods 
are available in the literature, e.g. jl, Q, EH- Even if some of these methods 
are efficient, they all have issues that call for improvement, most of the time 
regarding their convergence (which is not always guaranteed). Therefore, 
the numerical results they produce are sometimes not satisfactory. Although 
this list is not exhaustive, a more elaborate discussion on the advantages and 
disadvantages of each of these methods is given in the concluding remarks in 
Section [6j Our goal in this paper is to present an efficient numerical method 
which is stable and guaranteed to converge (before discretization). For that, 
contrarily to the previously existing methods, we propose to solve numerically 
the Monge-Ampere equation (J2J) for a convex solution then producing the 
solution T = V\l/ to the L 2 optimal transport problem. The numerical resolu- 
tion of the Monge-Ampere equation is still a challenge, though some progress 



has been made recently, e.g. [16j,[19[]. In [16], Loeper and Rapetti considered 
the Monge-Ampere equation ([2]) in the particular case where the target den- 
sity g is uniform, g — 1, and used a damped Newton algorithm to solve the 
equation. They also provided a proof of convergence of their method (before 
discretization) under some appropriate regularity assumptions on the initial 
density. However, their assumption on the final density, (i.e. g — 1), is too 
restrictive and therefore strongly limits the potential applications of their re- 
sult, from an optimal transport point of view. Here, we extend their method 
to the general case where the final density g is arbitrary (but sufficiently 
regular), which is the general context of the optimal transport problem, and 
we make their algorithm more efficient by presenting several numerical im- 



provements. This is a novelty in our work compared to [16(. Specifically, 
we approximate the fully nonlinear PDE (J2J) by a sequence of linear elliptic 
PDE's via a damped Newton algorithm. Then we extend the convergence 



result of [l6j to the more general context where the final density g is arbitrary 
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(under some suitable regularity assumptions). Here several new techniques 
are introduced due to the difficulties arising from final density g which is no 
more uniform in our work. Moreover, we present several numerical improve- 



ments to the method introduced in [16j . More precisely, we solve the linear 



PDE's approximating (j2J) using two different discretization methods, namely, 



a standard second order finite difference implementation, used in [16], and 
a fast Fourier transform (FFT) implementation (see Strain [24]). The FFT 
algorithm provides what appears to be a globally stable C(PLogP) method. 
In addition to this FFT speed up compared to (l6| . we also use for both 
implementations fourth order centered differences to approximate the first 
and second order derivatives involved in the nonlinear right-hand side of the 
Newton algorithm. To prove the theoretical convergence of the solutions of 
the linear elliptic PDE's to the actual convex solution of the Monge- Ampere 
equation (j2J), we exploit interior a priori estimates on the classical solutions 
to the Monge- Ampere equation. As far as we know, no global estimates that 
we could use are available for this equation. We thus restrict ourselves to a 
periodic setting to take advantage of the local estimates. Even with this re- 
striction, our numerical method still gives in practice very good results when 
applied to a wide range of examples, periodic or non-periodic (see Section 
E]). 

The paper is organized as follows. In Section [2] we present the problem 
together with some background results that will be used later. In Section 
[31 we introduce the damped Newton algorithm for the Monge- Ampere equa- 
tion in the general case of the L 2 optimal transport problem and discuss its 
convergence. In Section 01 we propose two different ways of discretizing the 
algorithm, and then test these implementations on three examples in Section 
One of these examples is taken from Medical imaging, namely, the de- 
tection of the multiple sclerosis (MS) disease in a brain magnetic resonance 
imaging (MRI) scan. Finally, we conclude with some remarks in Section [61 



2. Problem setting 

In what follows, d > 1 is an integer, and we denote by e, the i^-canonical 
unit vector of M d . A function ( : M. d — > R is said to be 1-periodic if ((x + ei) = 
C(x) for all x G lR d and % G {1, • • • , d}. Note that for such a function, its values 
on the subset Q : = [0, l} d of R d are sufficient to define its entire values on 
the whole space R d . Based on this remark, we will identify in the sequel 
1-periodic functions on M. d with their restrictions on Q = [0, l] d . Now, let \i 
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and v be two probability measures absolutely continuous with respect to the 
Lebesgue measure on M d , and assume that their respective densities / and 
g are 1-periodic. Then /, g : Q — > R, and the L 2 optimal transport problem 
with these densities reads as 



Moreover, the unique solution T = V\l/ to this problem (where \I/ : O — > 
is convex) satisfies the Monge- Ampere equation (j2J) on Q. The regularity 
and boundary conditions corresponding to this Monge- Ampere equation are 
given by the following theorem due to Cordero-Erausquin [6j. 

Theorem 2.1. Assume that //, u, f and g are defined as above and letrrif, m g , 
Mf and M g denote the infima and suprema of f and g respectively. Then 
there exists a convex function \& : Q — > Q which pushes \x forward to v (i. e. 
(V^^/i = v) such that is additive in the sense that V\l/(x + p) = 
V^(x) + p for almost every x G M. d and for all p 6 Z d Moreover, V\l/ is 
unique and invertible (fi a.e.), and its inverse V$ satisfies (V$)#z/ = \i. 
In addition if f and g are of class C a (fl) with a > and if m g ,Mf > 0, 
then ^ G C 2, ^(Q) for some < j3 < a and it is a convex solution of the 
Monge-Ampere equation |D). 

Note that since V\l/ is additive, it can be written as x plus the gradient of a 
1-periodic function. Thus, we assume = \x\ 2 /2 + u(x) with Vu(x+p) = 
Vu(x) for all p G Z d , i.e. u is 1-periodic. So by using this change of function, 
ty(x) = \x\ 2 /2 + u(x), in the Monge-Ampere equation (j2J), we see that the 
corresponding equation in u satisfies a periodic boundary condition on Q. 
This justifies why we introduce this change of function in Section |3]to rewrite 
Eq. (T5]). In fact, the periodic boundary conditions will allow us to use interior 
a priori estimates for classical solutions of the Monge-Ampere equation on 
the whole domain in order to prove the convergence of our algorithm (see 
Section I3.2p . We also infer from this theorem that if /, g G C a (Q), then 
\]> £ C 2 'P{VL) is the unique (up to a constant) convex solution of the Monge- 
Ampere equation (J2J) on Q. Finally, classical bootstraping arguments from 
the theory of elliptic regularity can be used to prove that if /, g G C k,a (Q), 



inf 




x — T(x) \ 2 dfi(x); T#fi = u, dfi(x) = f(x)dx, du(x) = g(x)dx 




then * G C k+2 ^{Q). 



5 



3. The Damped Newton algorithm 

3.1. Derivation of the algorithm 



Loeper and Rapetti presented in [16| a numerical method based on New- 
ton's algorithm to solve the equation 

det(£> 2 *) = fix) 

in a periodic setting. This equation can be associated with the optimal 
transport problem in the case where the target measure v has a uniform 
density, i.e. g — 1. Here, we propose to extend this algorithm and the 
underlying analysis to the general case of an arbitrary smooth 1-periodic 
density g. Motivated by the remark made in Section [5J we follow |l6] and 
introduce the change of function ty(x) = \x\ 2 /2 + u(x) to rewrite the Monge- 
Ampere equation ([2J in the equivalent form 

M{u) = g{x + Vu{x)) det(X + D 2 u(x)) = f(x). (4) 

Therefore, we will solve (j3J) for a 1-periodic solution u such that \x\ 2 /2 + u 
is convex on Q = [0, l] d . Since we want to develop an algorithm based on 
Newton's method, we first linearize (jlj). Indeed, using the formula for the 
derivative of the determinant ly, [20|, we have 

det (X+D 2 (u + s9)) = det (X + D 2 u) + s Tr (Adj(X + D 2 u)D 2 6) + 0{s 2 ) 

where Adj(A) = det (A) ■ A^ 1 . Also, from the usual Taylor expansion, we 
have 

g(x + V(« + s9)) = g(x + V«) + s Vg(x + Vm) • V6 + 0(s 2 ). 

Multiplying the latter two expressions, we obtain that the derivative in di- 
rection 9 of the right hand side of equation (j4|), denoted D U M ■ 9, is given 
by 

g(x + Vu) Tr (Adj (X + D 2 u)D 2 9) + det (X + D 2 u) Vg(x + Vu)- V0. (5) 

With this linearization at hand, we can now present the damped Newton 
algorithm that we will use to solve equation (jlj). 
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Damped Newton algorithm 

With Mo given, loop over n G N 

Compute /„ = g(x + Vu n ) det (X + D 2 u n ) 

< Solve the linearized Monge- Ampere equation 

1 



(6) 



D„._M ■ Or, 



T 



if-fn 



Update the solution : -u nH 



r . 



The factor 1/r (r > 1) in the algorithm is used as a step-size parameter 
to help preventing the method from diverging by taking a step that goes 
"too far". As we will see below, the value of r is crucial for the proof of 
convergence of the algorithm. Indeed, we will show that if we start with a 
constant initial guess for the Newton method, then there is a r such that 
the method will converge (provided some extra conditions on the densities 



are satisfied). Furthermore, by modifying some results presented in [11 
it is possible to prove that a second order linear strictly elliptic pde with 
periodic boundary conditions has a unique solution up to a constant if its 
zeroth-order coefficient is 0. The linearized Monge- Ampere equation at step 
n, which we will denote by L n , falls into this category through the setting 
of the algorithm. To fix the value of that constant, we select the solution 
satisfying j Q udx = 0. This is guaranteed by choosing a n which satisfies 
this condition for every n. 



3. 2. Proof of Convergence 

To prove the theoretical convergence of our algorithm (jH]), we follow the 
but we introduce several new key steps to deal with the 



16 



arguments m 

non-uniform final density g. In particular, we rely on three a priori estimates 
for the solution of the Monge-Ampere equation. The first one is derived by 
Liu, Trudinger and Wang 13] and goes as follows: if A < f/g < A for some 
positive constants A, A, and if / G C a (Q) for some a G (0, 1), then a convex 
solution of (El) satisfies 



|^||c 2 .«(n ri ) < Ri 



1 + 



1 


f( x ) 




a(l — a) 


g(VMx))) 


c a (n) 



(7) 



where Q n = {x G Q : dist(x, dfl) > ri} and Ri is a constant which depends 



only on d, r 1; A, A and Q. The second one, discovered by Caffarelli 27] and 
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expressed by Forzani and Maldonado [9], presents a bound on the Holder 
component of V\P. It states that if / and g are as in the previous estimate, 
there exists some constant k such that 



< R,, 



K 



i+* / M(^,y,r 2 ) 



for | ^ — 2/| < r 2 , where ip y (x) = *&(x + y) — *&(y) — V^(y) ■ x, 

r { KM(V,y,r 2 ) \&\ 
R v = max < 1, ; — > 



i 

i+fc 



(8) 



and M(\E f , r 2 ), rn(^ r , r 2 ) denote respectively the maximum and minimum 
of ipy(z-y) taken over the points z such that \z — y\ = r 2 . Since this estimate 
docs not hold for all k. one might wonder for which values it is actually valid. 
In 19J, it is shown that it holds for 



k = 2K(K + 1), K 



2 3d+2 w d w d . 1 A 
d~ 3 / 2 A 



and w r . 



71 



m/2 



T(m/2 



Here w m is the volume of the m-dimensional unit ball. To give the reader an 
idea of these values, a few of them are presented in Table Q] below. 



d 


w d 


Rounded K 


1 


2 


64C 


2 


71 


4550C 


3 


4tt /3 


140039C 


4 


7T 2 /2 


2709370C 



A 



Table 1: Quantities involved in the bounds on — for C = — 

A 



Finally, the third estimate controls the growth of the second derivatives of 
\1/ with respect to boundary values, provided / jg G C 2 (Q) and \& G C 4 (fi) 



11. 25 



sup|L> 2 ^| < R 3 ( l + sup|L> 2 ^| ) (9) 
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where R 3 depends only on Q, d, f/g and on the sup of \I/ + V\& in fl. We 
can now state and prove the theorem on the convergence of Algorithm 



Note that the arguments of the proof are similar to [16|, but the fact that 
the target density g is non-uniform here introduces some new difficulties 
that are worthwhile exposing. In addition, the proof provides important 
information that can be used to gain some intuition about the performance 
of the algorithm in practice. 

Theorem 3.1. Assume that Q = [0, l] d and let f, g be two positive 1-periodic 
probability densities bounded away from 0. Assume that the initial guess u 
for the Newton algorithm (0|) is constant. Then if f G C 2,a (Q) and g G 
C 3,a (Q) for any < a < 1, then there exits r > 1 such that (u n ) converges 
in C 4,l3 (Q), for any < < a, to the unique - up to a constant - solution 
u of the Monge-Ampere equation Moreover, t depends only on a, d, 

1 1 /lichen)) IMIc 3 ^^) an d Mf, Mg,rrif, m g which are defined as in theorem \2.1\ 

Proof. First we note that due to the additivity of the transport map, by 
applying the change of variable y = T n (x) = x + Vu n (x), we can prove that 
for all n, 

g(y) dy = g(f n (x)) det(DT n (x)) dx = / /„ dx = 1, 
J T n (si) Jn Jn 

i.e. at every step, we are solving the optimal transport problem sending f n to 
g. Moreover, unless otherwise stated, we only need to assume that / G C a (fl) 
and g G C 2,a (Q). The main steps of the proof consist in showing by induction 
that the following claims hold for all n : 

1.1 + D 2 u n and Adj(X + D 2 u n ) are C a (Q) smooth, uniformly positive 
definite (u.p.d.) matrices, where X denotes the identity matrix. 
/ 

2. — < f n < Cif, where C\ is independent of n. 
Li 

3- ||/ — f n \\c a (n) < Gii where C 2 is independent of n. 

We say that a matrix A is C a (Q) smooth if all of its coefficients are in C a (Q); 
it is u.p.d. (uniformly positive definite) if there exists a constant k > 
such that £, T Al; > k\C,\ 2 for all £ G Q. It is also worth mentioning that the 
statement in 1) actually implies that \l/ n = \x\ 2 /2 + u n is uniformly convex 
and that L n is a strictly elliptic linear operator. 
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Note that for w constant, we have /q = g. Next, let 

d = max (—, — J and C 2 = ||/||e«»(n) + lblle<*(fi)- 
\m g iTif J 

Then, it is easy to see that all the claims 1), 2) and 3) hold for n = 0. Let's 
assume they hold for a certain n e N and prove them for n + 1. For now, 
we suppose that the step-size parameter could vary with n. We shall prove 
later that we can actually take it to be constant without affecting any result. 
Let 9 n be the unique solution of L n 9 n — (/ — / n )/r such that J Q 6 n dx = 0. 
According to the results of [ijj (modified for the periodic case) there exists 
a constant k$ n such that 

ll^nllewni — — — fn\\c<*(n) — z ' i = l,2. (10) 



'n\\ci,°>(n) — \\J Jn\\C<*(Sl) 

Because u n +i = u n +9 n , we deduce that X+D 2 u n+ i and then Kd]{X+D 2 u n+ i) 
are C a (fl) smooth. Now, since X + D 2 u n is u.p.d., by assumption we get: 

eiX + D 2 u n+l )i = ?(l + D 2 u n )Z + f(D 2 6 n )Z 

> mf-^u-fniic^ib^ 

> ^| a -^ll/-/»ll c . ( n,Efe 2 + ^) 



kg n d 

K\ — fn\\c<*(Q) 



> K 2 \£\ 



T 

2 



for r large enough, where K 2 is a positive constant. Hence X + D 2 u n+ \ is 
a u.p.d. matrix. Next, inspired by the Taylor expansions previously shown, 
we write f n +\ in terms of /„ as follows: 

fn+i = g(x + V« n+ i) det (X + D 2 u n+ i) 

= g(x + VmJ det (X + D 2 u n ) + L n 9 n + r n 

= fn + t^ + r n . (11) 

T 

Now we bound the residual r n . It is easy to see that an explicit formula for 
r n can be obtained from the second order terms of the Taylor expansion of 
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the Monge- Ampere operator, and it consists of a sum of a bunch of products 
of at least two first or second derivatives of 9 n with g and its derivatives 
evaluated at V\I/ n and with second derivatives of By ( ITU]) , we know that 
we can bound the C a (Q) norm of the second derivatives of 8 n by a constant 
times ||/ — f n \\c c >(tt)/ T - In addition, since g G C 2 ' a (Q), the Holder norm of 
g and its first and second derivatives are all uniformly bounded. We then 
deduce that 

k 

\\r n \\c*(n) < -j- 11/ - fn\\c<*(p)i ( 12 ) 

where k Tn could potentially depend on the Holder norms of the first and 
second derivatives of \lV Next, by selecting r > 2k Tn C 2 , (fTTI) . (|T2j) and 3) 
imply 

\\f — fn+i\\c a (n) < fl — -J 11/ — /n||c a (n) + ~jjrll/ — /«llc Q (n) 

1 , fcr n C 2 



< ||/-/n|Um 1-- 



r t 2 
1 — 11/ — fn\\c a (n)- 

This shows that bound 3) is preserved for r large enough. In addition, it 
shows that we can take the step-size r such that the sequence of bounds K 2 
created recursively will converge to a constant strictly greater than 0. Let's 
now verify bound 2). If we take r > fc rn Cf/[m/(l — ^-)], from all the previous 
results and hypothesis we get 

f — fn+l < (/ — /n) + ^y||/ — /n||c«(n) 



2 



i 



-2 



from which we deduce that j jC\ < / n +i- Following a similar approach with 
a step-size r > &r„C2/[ m /(Ci — 1)], we obtain the other part of 2). Then, we 
go back and finish the proof of the first statement. Knowing that X+D 2 u n+ i 
is u.p.d., we see that det (X + D 2 u n+ \) > and therefore X+ D 2 u n+ \ is 
invertible. We can prove that its inverse is also a u.p.d. matrix. Indeed, if 
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£ = (X + D 2 u n+ i)y G fi, we have 



£ T (z + rAz„ +1 ) ^ = [((X + rAz„ +1 )y) T (X + rAi„ +1 ) 1 

((I + 2 Vi)!/) 
= y T (l + D 2 u n+1 )y > K 3 \y\ 2 = K 3 \(l + D 2 u n+1 )-^\ 2 . 

Using the inequality \AB\ < \A\\B\ with A = X + D 2 u n+1 and B = (X + 
-D 2 tt n +i) _1 £, we obtain |£| < |X+X> 2 ii„+i| |(X+D 2 w n+ i)~ 1 £|- Next, motivated 
by the equivalence of norms, we use the bounds we derived previously to get 

\X+D 2 u n+1 \ < drnax||(X+ D 2 u n+1 )ij\j 

< d(l + ||« n ||c2,«(n) + ||^rt||c 2 .«(n) 1 < #4 

where is a positive constant. This yields the claim: 



e (i+D 2 u n+1 ) x £> 



>||iei 2 = K 5 iei 2 , ^ 5 >o. 



IX+DV+iI 2 - #i 
We now use these statements to show that L n+ i is a strictly elliptic operator, 

d 

v Adi fX+D 2 ^^',.. 



g{x + Vu n+1 ) Ad J ( X + ^Vfi)^- 6& > fn+iK 3 



> ^K 3 iei 2 >^ 6 iei 2 . 

Note that by removing g from the previous inequalities, we get that Adj (X + 
D 2 u n+ i) is a u.p.d. matrix, which completes the proof of 1). Now, we show 
that the step-size r can be taken constant, as claimed before. Indeed, 1) 
gives \I/ n G C 2 ' a {VL) by construction while 2) yields /„ G C a (f2) and 



< < 



dM 9 - g(x + V« r 



< 



m r . 



Therefore, all the conditions to the estimate ([7]) are satisfied at every step. 
Using inequalities on Holder norms, we find 



f a 



C a (il) 



< \\fn\\c« 



(O) 



(13) 
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At this point, the only remaining challenge is to bound || V^/ n \\ c ^^ n y It can 
be achieved through the second estimate flE])- Since V^n is the transport 
map moving f n to g, we can refer to Theorem 12.11 to deduce that V\I/ n 
is invertible and thus V\I/ n G f2, which in turn yields \I/ n G [0, \fd\ when 
Q = [0, l] d . Therefore, we see that the maximum terms M(\l/, y, r 2 ) are going 
to be uniformly bounded and that the only problem could come from the 
minimum terms rn(ip^ , 0, a ), a = 1 or R y . Using ideas from convex analysis 



(presented for example in [12])), we can show that since \l/ n is uniformly 
convex for every n, we have m(^*^,0,a) = min ^* (z) where the minimum 
is taken on the sphere \z\ = a, a > 1 (with the periodicity we can increase 
the size of Q to include it inside and still have a uniform bound on \l/ n and 
V\I/ n ). Furthermore, V^* (z) = if and only if z = 0, V^* is strictly 
monotone increasing because Vip ny is and V^ 1 = V^. We see that the 
only possible breakdown happens when V^* y converges to a function which 
is zero up to \z\ = a. This means |V^ n J = |V\l/ n (x + y) — V^ n (y)\ —> oo 
as | a; | — > and n — > oo, for any y. Observe now that if we increase the 
regularity of the densities to / G C 2 ' Q (fi), g G C 3 '"^), we get /„ G C 2 ' Q (fi) at 
every step. This tells us that n G C 4 ' Q (fi) (see (ll|) and thus ^ n G C 4 ' Q (fi). 
Therefore, we can apply estimate OH]) and rule out this potential breakdown 
case. We obtain that the C 2,a (Q r ) norm of \l/ n is uniformly bounded and thus 
by the additivity of that function in a periodic setting, the same conclusion 
holds for its C 2,a (Q) norm. Hence, we deduce that it is also the case for k Tn 
and then kg n . From this, we get that we can select a r > 1 constant such 
that the three statements hold for all n G N by induction. Moreover, the 
sequence {u n ) n& n is uniformly bounded in C 2,a (Q), thus equicontinuous. By 
the Ascoli-Arzela theorem, it converges uniformly in C 2,/3 (f2) for < j3 < a 
to the solution u of PJ, which is unique since we impose J n udx = 0. Finally, 
due to the fact that the initial and final densities are actually C 2,a (Q), we 
know that this solution will be in C A, "(Q). □ 

3.3. Remarks on the Proof 

This proof by induction provides a lot of precious information concerning 
the properties of the iterates created by our method. First, since X+D 2 u n is 
u.p.d. at every step, we realize that the sequence of functions \l/ n is actually 
one of uniformly convex functions. Recall that the Monge-Ampere equation 
02]) is elliptic only when we restrict it to the space of convex functions. There- 
fore, the algorithm is extra careful by approximating the convex solution of 
the Monge-Ampere equation by a sequence of uniformly convex functions. In 
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addition, this guarantees that the linearized equation is strictly elliptic and 
thus has a unique solution (once we fix the constant). Furthermore, just like 



m 



161 ]. we can obtain estimates on the speed of convergence of the method. 



Indeed, assuming that r > 2k res C2, we got 

11/ - fn+i\\c(n) < (l - ^) ]]/ - fn\\c"{Q.) 

which tells us that (/ n ) converges to / following a geometric convergence with 
a rate of at least 1 — l/2r. When it comes to the step-size parameter r, it 
would be very useful to know a priori which value to select in order to make 
the algorithm converge. Such an estimate is unfortunately hard to acquire 
since some of the constants used through interior bounds are obtained via 
rather indirect arguments. However, we observe from lower bounds on r used 
in the proof, i.e. 

k r 2 k r 2 

r > T 8 2 , , r > , „ rcs ° 2 or r > 2k ies C 2 , 



i\ - (Ck - l)m f 



that the minimum value required on the step-size parameter to achieve con- 
vergence could potentially be large when m/ is close to or when either 
||/||c a (n) or ||g||c Q (c) is large. Through the numerous numerical experiments 
we conducted, we realized that r seems to behave according to both condi- 
tions. Therefore, knowing a priori that / could get close to 0, we can react 
accordingly by either increasing the value of the step-size parameter or by 
modifying the representation of the densities (which is possible in some ap- 
plications). Finally, even if our proof only guarantees convergence when the 
update 9 n is the solution of flHD, in practice we can get good results by replac- 
ing it by the solution of g(x + Vu n ) Tr (Adj(X + D 2 u n )D 2 9 n ) = (/ - f n )/r, 
or sometimes by an even simpler equation. 



4. Numerical Discretization 

We present here a two-dimensional implementation of the Newton algo- 
rithm (Q. We consider a uniform N x N grid with a space-step h = 1/N 
where we identify Xi = with X{ — 1 (i — 1, 2) by the periodicity. It is easy 
to see that the most important step for the efficiency of the method is the 
resolution of the linearized Monge-Ampere equation. Indeed if we take P to 
be the number of points on the grid (P = N 2 in 2D), as every other step 
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can be done in 0(P) operations, the computational complexity of the whole 
method is dictated by the resolution of this linear pde. Therefore, we will in- 
troduce below two methods for solving this equation. For the other steps, we 
employ fourth-order accurate centered finite differences for the discretization 
of the first and second derivatives of Un. We thus improve considerably the 



accuracy of the results compared to [16| where second order differences are 



used to approximate these terms, but at the same time we do not decrease 
the efficiency of the whole algorithm whose complexity is dominated by the 
resolution of the linear PDE. If we know g explicitly, then we can compute the 
compositions g(x+ Vm„) and Vg(x+Vu n ) directly. However, it is not always 
the case, especially when we deal with discrete data as in the examples of im- 
age processing in Section In such circumstances, we have to approximate 
them. To do so, a popular choice would be to use a linear interpolation but 
in practice, we find that using only a closest neighbour interpolation gives 
good results in most scenarios. Another salient point is that even though in 
theory /„ has a total mass of 1 at every step, it is not necessarily the case in 
the numerical experiments, due to discretization errors. However, we need 
the right-hand side of the linearized Monge- Ampere equation to integrate to 
on the whole domain. To deal with this, we introduce a normalization step 
right after computing /„ in the implemented algorithm, taking 

1 

fn = fn~ ^ ] fn itj + 1 (14) 

i,j=0 

instead of f n and thus translating it at every step. 

4-1. A Finite Differences Implementation 

We begin by presenting an implementation of the resolution of the lin- 
earized Monge-Ampere equation through finite differences. This choice is 
motivated by the fact that it is the method chosen by Loper and Rapetti in 



16l | for their corresponding algorithm. In this case, to reduce the complexity 
of the code, we only use centered finite differences of second-order for the 
derivatives of 6 n . Since the linear pde has a unique solution only up to a 
constant, the linear system Ax = b corresponding to its discretization has 
one free parameter that we need to fix to turn the matrix into an invertible 
one. A possible strategy to achieve this is to create a new system Ax = b 
by adding the extra equation Yl x k = 0; which corresponds to selecting 9 n 
such that J n 8 n dx = 0. Note that this new matrix has full rank, but it is 
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not square. Then, we take that extra line, add it to all the other lines of A 
and then delete it to get a square system, Ax = b. The next lemma shows 
conditions under which the resolution of Ax = b will produce a valid answer 
to the system Ax = b. For the sake of notation, consider the new equation 
to be stored in the first line of A. 

Lemma 4.1. Let A and A as defined above, i.e., A is a (P + 1) x P matrix 
with rank P = N 2 , and there exist real numbers oti, 012, • • • , «p+i not all zero 
such that ct\L\ + 02-^2 + . . . + ap + \Lp + i = where Li is the ith line of A. 
If 0C2 + ■ ■ ■ + ctp+i 7^ Oil, then A has rank P. 

The proof is a straightforward use of matrix algebra and therefore not 
reported here for brevity. This lemma does not hold if condition 0.2 + ... + 
ap + i 7^ oti is not satisfied. Take for example a matrix A such that its second 
line is equal to the negative of its first line and all its other lines are linearly 
independent. Then A has rank P — 1. Nonetheless, due to the structure of 
our problem, this is not going to happen. Unfortunately, this strategy has 
the downside of somewhat destroying the sparsity of the matrix. One way 
to avoid this would be to equivalently fix only the value of 9 at a given point 
and then use the same strategy. This would preserve most of the sparsity of 
the matrix. 

Next, to actually solve the system Ax = b, we employ the Biconjugate 
Gradient (BICG) iterative method. This choice can be justified by the fact 
that we are dealing with a (potentially sparse) matrix which is not symmetric 
nor positive definite; the BICG procedure being specifically designed to deal 



with these conditions [22|. One feature of this method is that, provided the 
method does not break down before, the sequence of approximate solutions 
it produces is guaranteed to converge to the solution of the linear system in 
a maximum of P steps, which yields a computational complexity of at worst 
0(P 2 ). However, as we shall see later, in practice it can be much smaller 
than that. In addition, even if the BICG algorithm is commonly employed 
with a preconditionner, we did not find the need to do so while conducting 
our numerical experiments. 

4-2. A Fourier Transform Implementation 

One should realize that the first implementation we employed to solve the 
linearized Monge-Ampere equation might not be the best method. Indeed, 
there exist much cheaper ways to solve a linear second-order strictly elliptic 
equation with such boundary conditions. The one we are going to explore 
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here is due to Strain [24| and requires only O(PLogP) operations through 
the use of the FFT algorithm. It consists in rewriting the problem as the 

system 

L n L~ a(x) = h(x) 
L n 6(x) = a(x), 

where L n is the averaged L n in the sense that its coefficients are the integral 
over Q of the coefficients of L n . We then expand a in Fourier series by taking 



ct(.v )= V a(k)e 27Tik - x and a{k) = [ a(x)e~ 27rik - x dx, 



where i representing and &(k) being the usual Fourier coefficients. Using 
this expansion in the first part of (fl5l) yields the formula 

d 

LL~ l o(x) = J2a i:j {x) ^ik^mkjpityaiky^ 

i,j=l k^0eZ d 

d 

+ J2bi(x) ^ik lP -(k)a(k)e 2 * 

i=l k=/=0eZ d 
d d 

= a ij( x ) a ij( x ) + y^ j b i (x)/3j(x) 

i,j=l i=l 



ik-x 



where 



if the sum is not 



P(h) = \ J2i,j=i (i, fl~ik;l-ik :i + b^niki (16) 
otherwise. 

For the discretized problem, knowing the value of a, we can compute a with 
one application of the FFT algorithm and then compute aij and with 
d(d + 1) applications of the inverse FFT algorithm to be able to get the value 
of LL &(x) in O(PLogP) operations. Therefore, we can use an iterative 
method to solve the first equation of system fTl5|) at a cost of O(PLogP) 



operations per iteration. As in Strain [24(, we use the Generalized Minimal 



Residual method, or GMRES. Just like BICG, it is an efficient way of solving 
a linear system of equations where the matrix is non-symmetric and non- 



positive definite [22(. Moreover, GMRES does not use projections on the 
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Krylov subspace generated by the transposed matrix. This makes it easier 
to code for the particular setting we are dealing with since we do not form A 
directly; we reference it instead through the result of its product with a given 
vector o~. Strain observed that the number of GMRES iterations required did 
not vary with P, which yields a global complexity of O(PLogP). Note that 
for better performances, we actually employ like the author the restarted 
GMRES(m) method. After computing cr(x), we need to solve L6(x) = a(x). 
This can be easily achieved since we already know the value of L a(x). 
More specifically, we have 

6{x)= P(k)*(k)e 2 ™ k - X , 

i.e. it requires only one other application of the (inverse) FFT algorithm 
to obtain 9. On top of the efficiency of this method, observe that it has 
other advantages. It is spectrally accurate, i.e. the error decreases faster 
than any power of the grid size as the space-step size goes to 0. We can also 
prove that the convergence rate for the GMRES algorithm is independent 
of the grid size. For more details, one should consult the original paper 



24(. In the actual discretization of this method, we truncate the sums in 



the usual way by varying \k\ from —N/2 to N/2. We compute the averages 
of the operator's coefficients with Simpson's numerical integration formula. 
Finally, the discrete linear system still has a solution unique only up to a 
constant and we can use the same strategy as in the previous case to fix it. 



5. Numerical Tests 

5.1. A theoretical example 

Our goal here is to observe and compare the behaviour of the two imple- 
mentations. Starting with a known u and a known g, we compute the corre- 
sponding right-hand side / with (jl]), and then we run the algorithm to obtain 
u n . We consider functions of the form 

m(xi,x 2 ) = — cos (27T7Xi) sin(27r7£ 2 ), 
k 

g(xi,x 2 ) = 1 + acos (2iipxi) cos(2iipx2). 

For the first implementation we select for the BICG algorithm a tolerance of 
10 -4 and a maximum number of 1000 iterations per Newton step. For the 
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(a) The \\u — u n ||/2 error for the Fourier (b) The ||tt — u n \\i2 error for the finite dif- 
transform implementation on a semilog ferences implementation on a semilog plot 
plot 




Figure 1: Error behavior for u n and /„ for a tolerance of 10 



FFT implementation, we take the same tolerance with a restarting threshold 
of m = 10 inner iterations for the GMRES algorithm. In both cases, a 
value of r = 1 was enough to achieve convergence. The errors \\u — u n \\i2 
and ||/ — f n \\p are plotted in Figured] as functions of the Newton iterations 
for both the FFT and finite differences and for various grid sizes ranging 
from 16 x 16 to 256 x 256 grid points. We see that in both cases the error 
gets smaller as we increase the grid size. In particular, for this value of r, 
after the first 4 iterations or so, where ||w — u n \\i2 settles down very quickly, 
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the convergence of ||/ — f n \\p follows a linear slope with a convergence rate 
slightly faster than a half. The estimated ratio is actually about 0.45 in the 
FFT case and is about 0.33 in the finite differences case, so the convergence is 
faster in this latter case for this final stage. Computing the observed order of 
accuracy from the errors between u and u n , we get from smaller to bigger grid 
sizes, 4.3521,4.0035,3.9965 and 3.9990. This confirms that the fourth-order 
is consistent with the order of the finite difference scheme used to compute 
the right-hand side. 

In order to investigate whether we can decrease the computing time with- 
out loosing too much precision on the results, we try to run the experiment 
again with a tolerance 10 _1 (see Figure [2]). Due to the looser tolerance em- 
ployed, the results are a bit erratic for the finite differences implementation, 
but overall still very good. Figure 2(c) shows the 3d plot of u—u n for N = 128 
in the Fourier transform case to get an idea of the distribution of the errors. 
As we can see, they seem evenly distributed on the whole domain. Figure 



2(d) depicts what happens when we vary the value of the step-size parameter 



t. The results behave accordingly to our expectations, with a slower con- 
vergence for a bigger r. Note that for this new tolerance, the computational 
cost of one iteration is now much less expensive and the global computing 
time decreases a lot in both cases. We can quantify this by looking at Ta- 
ble [2j Observe that the BICG algorithm required less operations than the 
worst case scenario 0(P 2 ). This being said, we still realize at first glance 
that the FFT method is much faster than the finite difference method. The 
number of GMRES iterations per Newton iteration stayed nearly constant 
as we increased the grid size, which confirms the O(PLogP) computational 
complexity. 

Finally, in order to get an idea of the stability properties of both methods, 
we can measure the norm of the inverse of the matrix corresponding to the 
discretization of the linearized Monge- Ampere operator (see for example [Hi] 
for more information on the stability concept for iterative methods). This 
can be achieved by computing the spectral radius of such matrix. In the 
finite differences case, we could get this eigenvalue directly by first obtain- 
ing the inverse, and then computing the eigenvalues of the new matrix. We 
observed that for the current experiment, the spectral radius starts at about 
10 for N = 16 and then grows almost linearly as we increase the grid size. 
Hence, the finite difference method appears unstable. For the FFT case, 
since we do not possess an explicit representation of the matrix, we have to 
use an indirect method to compute the spectral radius. The one we select 



20 




Number of Newton Iterations Number of Newton Iterations 



(a) The \\u — u n ||/2 error for the Fourier (b) The — u n \\i2 error for the finite dif- 
transform implementation on a semilog ferences implementation on a semilog plot 
plot 




(c) Surface plot of u — uq for the Fourier (d) The ||u — u n ||;2 error for the Fourier 
transform implementation in the N — 128 transform implementation with different 
case. values of r in the N — 128 case on a 

semilog plot 

Figure 2: Several examples of the results obtained with a tolerance of 10 _1 . 



is the power method (or power iteration). For a matrix A, it starts with 
a vector 6 and compute the iterates bk+i = Abk/\\Ab)-\\. If A has a domi- 
nant eigenvalue and if bo has a non-zero component in the direction of the 
eigenvector associated with this largest eigenvalue, then the sequence (bk) 
converges to the eigenvector associated with the spectral radius of A (see 
l~ol|). For the current experiment, we apply this technique to the inverse ma- 



trix produced at every Newton step n by the discretization of the linearized 
Monge- Ampere equation via the FFT implementation. More specifically, for 
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computing time 
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computing time 


N 


of 


for 


of 


for 




GMRES iterations 


Fourier 


BICG iterations 


Finite 






Transforms 




Differences 


16 


5.32 


1.07 


14.21 


2.21 


32 


6.37 


1.94 


17.79 


8.70 


64 


7.32 


8.06 


31.11 


79.17 


128 


7.95 


34.38 


63.32 


1221.10 


256 


8.05 


145.07 


134.63 


34639.82 



Table 2: Average number of BICG and GMRES iterations per Newton iteration and total 
computing time in seconds for the whole experiment (20 iterations) when the tolerance is 
set to lO^ 1 . We used a MATLAB implementation on an Intel Xeon running at 2.33 GHZ. 
This is presented for all the different grid sizes. 



a given n, we start with a &o randomly generated with components in [0, 1]. 
Then, using the method presented in Section I4.2[ we compute the product 
A^bk and then the iterate with A" 1 ^/^" 1 ^^. Repeating this procedure 
several times with different random vectors £> ? we observed that the power 
iteration always converges to a number close to 0.03 after only about k = 5 
iterations, for every n from to 20 and for every iV from 16 to 256. Thus, 
we conclude for this specific example that the spectral radius of the inverse 
matrix generated at every step of the Newton method is close to 0.03, which 
of course suggests that the FFT implementation is stable. 

5.2. Application to medical imaging 

Here, we test our algorithm on one of the various applications of opti- 
mal transport. In the area of image processing, one of the most common 
tasks performed by practitioners is to determine a geometric correspondence 
between images taken from the same scene in order to compare them or to 
integrate the information they contain to obtain more meaningful data. One 
could think of pictures acquired at different times, with different equipments 
or from different viewpoints. This process falls into the category of what is 
referred to as image registration. There are two main types of image regis- 
tration methods: the rigid ones which involve translations or rotations and 
the nonrigid ones where some stretching of the image is required to map 
it to the other one. People working on optimal transport recently realized 
that this theory could provide a good nonrigid image registration technique. 
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Indeed, consider for example two grayscale images. We could think of them 
as representing a mass distribution of the amount of light "piled up" at a 
given location. A bright pixel on that image would then represent a region 
with more mass whereas a darker pixel would correspond to a region with 
less mass. Computing the optimal map between the two images and analyz- 
ing the rate of change of that map could reveal the best way (in terms of 
minimizing the transportation distance) of moving the mass from the first 
density to the second, precisely showing what is changing on the images and 
how it is happening. In [2l[, Rehman et al. actually lists several advantages 
of the optimal transport method for image registration. However they also 
stress the fact that it is computationally expensive and this is one reason 
why it is important to find efficient numerical methods to solve this problem. 

Our first applied test is in the field of medical imagery. We consider 
the two brain MRI scans presented in Figure [31 These images were taken 
from the Brain Web simulated brain database at McGill university [sj and 
represent a slice of a healthy brain and a slice of the same brain where the 
multiple sclerosis disease is spreading. This nervous system disease damages 
the myelin sheets around the axons of the brain and leaves scars (or sclerosis) 
visible on an MRI. We chose MS as a test case since its actual detection pro- 
cess relies on neuro-imaging which tries to identify the scars whose presence 
leaves traces similar to multiple tumours. Note that because the scans are 
dark, their representation in greyscale contains many values close or equal to 
0. For the obvious reason of accuracy, we normalize the densities and bound 
them away from by applying the translation in f[T4|) on both of them be- 
fore initiating the algorithm. In addition, we rescale them so that they are 
exactly square (256 x 256 pixels). This is not required, but helps simplifying 
the code. On the bottom panels of Figure [31 we show the results reached 
after only 4 Newton iterations with a r = 1 and a tolerance of 10 -2 for the 
Fourier transform implementation. We note that the corresponding L 2 norm 
of the error between / and f n is reduced to about 0.001. The 3d plot in 
Figure 3(c) is characterized by very sharp spikes corresponding to variations 
in brightness between the two images where the scars are located. To get a 
better visual understanding of the situation, we also took the graph of the 
filtered contour plot, coloured in white the inside of the contour lines corre- 
sponding to the affected regions and we superposed this image to the MRI 
scan of the healthy brain (Figure 3(d)). The number of GMRES iterations 
required per Newton iteration was very small and nearly constant (only 1 
outer iteration and about 6 inner ones). Even if our code was not necessarily 
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(c) Surface plot of div(u4) (d) Scan of the healthy brain on which 

was superposed the coloured filtered 
contour plot of div(u4) 



Figure 3: The results of the MS detection experiment with the Fourier transform imple- 
mentation 
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optimized in terms of speed, it only took about 30 seconds to compute these 
results on an Intel Xeon with 2.33 GHZ of RAM. Moreover, the spectral ra- 
dius was still very close to 0.03 for the inverse discretization matrix, which is 
very encouraging. We also need to mention that we don't have access here to 
an analytical expression for / or g. Therefore, to compute an approximation 
for g(x + Vu n (x)) and Vg(x + Vu n (x)) at every grid point x = (ih,jh), we 
employ a closest neighbour approximation and like previously pointed out, 
the results are still very good. In addition to that change detection, the 
optimal transport plan T = x + Vm(i) actually gives us precise information 
on the amount of variation from one MRI scan to the other. Indeed, we 
can define a metric between probability densities from the solution to the 
transport problem; the distance being 

d(f,g) = [ \x-T(x)\ 2 f{x)dx = [ \Vu{x)\ 2 f{x)dx. 
Jn Jn 

This could quantify the magnitude of the change between the two images and 
thus help monitor the growth of the disease. In our experiment, we got a value 
of 4.14 x 10~ 10 . Note that when we compared it to other ones obtained from 
different numerical experiments on MRI scans with presence of MS, these 
numbers validated our visual estimates; more scars would produce a bigger 
number. Recall finally that even though we implemented it only in 2D, in 
theory it is valid in any dimension. Therefore, it could also be applicable on 
3D datasets which would be much more realistic when it comes to analyzing 
biological phenomena similar to the ones we treated here. 

5.3. A non-periodic example 

So far, we selected examples that were well suited for our periodic bound- 
ary conditions. Nonetheless, we claim that our algorithm could produce great 
performances on densities which are not necessarily periodic. To demonstrate 
this, we chose two famous pictures in the image processing community as ini- 
tial and final densities, namely Lena and Tiffany. First, we do a little bit 
of preprocessing by translating the initial pictures and by scaling them to 
exactly 256 x 256 pixels. We point out that we did not apply any smoothing 
to any of the images. Then, we select r = 2, tol= 10 _1 and run 20 iterations 
of the Newton algorithm with the Fourier series implementation. The output 
is presented in Figure HI Yet again, the number of GMRES iterations stayed 
nearly constant (only about 4 inner iterations) which made the computing 
time very small and the spectral radius of the inverse of the discretization 
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matrix stayed close to 0.03. One can observe the effect of the periodic bound- 
ary conditions (on f 5 for example). We see that the periodicity did not affect 
the overall outcome. Indeed, /20 and / are almost visually identical. If we 
use a linear interpolation instead of a closest neighbour interpolation, we 
could not visually tell the difference between / 20 and / since we were able 
to reduce a lot more the maximum of / — f n . This time, our method did 
not converge for a r equal to 1. This could be explained by the fact that 
the images vary a lot and even by translating, some values were still close to 
0. Moreover, when we repeat the same experiment with the finite difference 
implementation, the r required jumped to 8 and we had to iterate about 
60 times to get good results. This is yet another argument in favour of the 
Fourier transform implementation. 



6. Concluding remarks 

In conclusion, we saw that our algorithm presents a very good way of 
computing the numerical solution of the L 2 optimal transport problem. The 
Fourier Transform implementation makes it accurate, fast, stable and thus 
very efficient. In the context of image registration, the limitation to densities 
bounded away from and to periodic boundary conditions did not seem to 
be a serious shortcoming for applying this algorithm to important practical 
examples. We also saw that even if our method has the downside of having 
to choose a value of r without giving too much information on how to make 
this choice a priori, when using the Fourier transform implementation, in all 
our experiments we never had to take a r bigger than 2 to get convergence. 



We also conducted more numerical experiments in [23j], such as taking the 
initial and final densities to be the periodic approximation of gaussian dis- 
tributions, and the performances were as good as the ones presented here. 
The interested reader can consult |23|] for more details. Furthermore, com- 
pared to available numerical methods, our algorithm is very efficient. Indeed, 
Benamou and Brenier's Q use of the fluid dynamics reformulation of opti- 
mal transport introduces an extra time variable to the problem which is a 
non-necessary cost for all practical purposes of interest to us. In Q, Char- 
trand et al. presented a gradient descent on the dual problem of the optimal 
transport problem which produces acceptable results. However, when they 
applied their method to the Lena-Tiffany example, numerical artifacts ap- 
peared as they iterated and their method did not fully converge, as opposed 
to ours. Finally, Haber et al. introduced a projection algorithm on the mass 
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preservation constraint in 13[ which is as of now, probably the best method 
to solve the optimal transport problem. Just like the method we developed 
here, it enjoys efficiency and stability properties. However our method is 
guaranteed to converge in theory (before discretization), thanks to Theorem 
13.11 This is not necessarily guaranteed in their case. 

To pursue this work in the future, the authors would like to extend the 
method to different types of boundary conditions. However, for this to hap- 
pen, we would require global a priori estimates on the Holder norm of the 
solution of the Monge-Ampere equation, which to the best of our knowl- 
edge are not yet available. Moreover, it would be interesting to implement a 
parallel version to increase the performances even more. 
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